Direct F0 control of an electrolarynx based on statistical excitation feature prediction and its evaluation through simulation
Abstract
An electrolarynx is a device that artificially generates excitation sounds to enable laryngectomees to produce electrolaryngeal (EL) speech. Although proficient laryngectomees can produce quite intelligible EL speech, it sounds very unnatural because of the mechanical excitation produced by the device. To address this issue, we have proposed several EL speech enhancement methods using statistical voice conversion and have shown that statistical prediction of excitation parameters, such as F0 patterns, is essential to significantly improving the naturalness of EL speech. In these methods, the original EL speech is recorded with a microphone and the enhanced EL speech is played back through a loudspeaker in real time. This framework is effective for telecommunication, but it is not suitable for face-to-face conversation because listeners hear both the original and the enhanced EL speech. In this paper, we propose direct F0 control of the electrolarynx based on statistical excitation prediction, with the aim of developing an EL speech enhancement technique that is also effective for face-to-face conversation. The F0 patterns of the excitation signals produced by the electrolarynx are predicted in real time from the EL speech, which the laryngectomee produces by articulating excitation signals generated with the previously predicted F0 values. A simulation experiment is conducted to evaluate the effectiveness of the proposed method. The experimental results demonstrate that the proposed method significantly improves the naturalness of EL speech while keeping its intelligibility sufficiently high.
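The closed loop described in the abstract, in which each predicted F0 value drives the excitation and the resulting EL speech is in turn used to predict the next F0 value, can be illustrated with a minimal simulation sketch. Everything named here is an assumption made for illustration: `simulate_closed_loop_f0`, `toy_feature`, and `toy_predictor` are hypothetical, the scalar "feature" stands in for real spectral features of EL speech, and the linear predictor stands in for the statistical model, which the abstract does not specify.

```python
from typing import Callable, List, Sequence


def simulate_closed_loop_f0(
    articulation: Sequence[float],
    predict_f0: Callable[[float], float],
    synthesize_feature: Callable[[float, float], float],
    initial_f0: float = 100.0,
) -> List[float]:
    """Return the F0 value (Hz) that drives the excitation at each frame.

    Hypothetical sketch: the F0 predicted from frame t is applied at frame t + 1,
    so the predictor always observes speech produced with earlier predictions.
    """
    f0_track: List[float] = []
    f0 = initial_f0  # assumed starting value before any prediction exists
    for a in articulation:
        f0_track.append(f0)
        # EL speech feature observed when articulating the current excitation.
        feature = synthesize_feature(a, f0)
        # The prediction from this frame drives the excitation of the next frame.
        f0 = predict_f0(feature)
    return f0_track


def toy_feature(a: float, f0: float) -> float:
    """Assumed stand-in for feature extraction: articulation plus slight F0 leakage."""
    return a + 0.001 * f0


def toy_predictor(feature: float) -> float:
    """Assumed stand-in for the statistical F0 predictor (a simple linear map)."""
    return 100.0 + 20.0 * feature


track = simulate_closed_loop_f0([0.0, 0.5, 1.0], toy_predictor, toy_feature)
print(track)
```

The one-frame delay is the point of the sketch: because the observed speech already carries the previously predicted F0, a predictor used in such a loop has to cope with input that depends on its own earlier output.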
Similar resources
An Evaluation through Simulation of Electrolarynx Control based on Statistical F0 Prediction for Multiple Speakers
An electrolarynx is a device that artificially generates excitation sounds to produce electrolaryngeal (EL) speech. Although proficient laryngectomees can produce intelligible EL speech by using this device, it sounds quite unnatural due to the mechanical excitation. To address this issue, we have proposed several EL speech enhancement methods using statistical voice conversion and showed that ...
An inter-speaker evaluation through simulation of electrolarynx control based on statistical F0 prediction
An electrolarynx is a device that artificially generates excitation sounds to produce electrolaryngeal (EL) speech. Although proficient laryngectomees can produce intelligible EL speech by using this device, it sounds quite unnatural due to the mechanical excitation. To address this issue, we have proposed several EL speech enhancement methods using statistical voice conversion and showed that ...
A Vibration Control Method of an Electrolarynx Based on Statistical F0 Pattern Prediction
This paper presents a novel speaking aid system to help laryngectomees produce more naturally sounding electrolaryngeal (EL) speech. An electrolarynx is an external device to generate excitation signals, instead of vibration of the vocal folds. Although the conventional EL speech is quite intelligible, its naturalness suffers from the unnatural fundamental frequency (F0) patterns of the mechani...
Real-time vibration control of an electrolarynx based on statistical F0 contour prediction
An electrolarynx is a speaking aid device to artificially generate excitation sounds to help laryngectomees produce electrolaryngeal (EL) speech. Although EL speech is quite intelligible, its naturalness significantly suffers from the unnatural fundamental frequency (F0) patterns of the mechanical excitation sounds. To make it possible to produce more naturally sounding EL speech, we have propo...
Physically Constrained Statistical F0 Prediction for Electrolaryngeal Speech Enhancement
Electrolaryngeal (EL) speech produced by a laryngectomee using an electrolarynx to mechanically generate artificial excitation sounds severely suffers from unnatural fundamental frequency (F0) patterns caused by monotonic excitation sounds. To address this issue, we have previously proposed EL speech enhancement systems using statistical F0 pattern prediction methods based on a Gaussian Mixture...